fix(login): prevent env var auth leak in guest sessions#267
Merged
Conversation
…t sessions Session factories conflated "no override provided" with "explicitly no auth" via `api_key_override or get_api_key()`. When RunpodGraphQLClient set api_key_override=None for require_api_key=False, the `or` fallback picked up RUNPOD_API_KEY from the environment, causing flash login to send a Bearer token instead of acting as a guest. The server then never returned an apiKey. Introduces _UNSET sentinel so api_key_override=None means "send no auth" while omitting the parameter preserves existing credential resolution.
…uth scenarios - Replace mocked get_api_key test with real credential resolution (fresh user) - Add test: credentials file has key + require_api_key=False → no auth header - Add test: both env var and credentials file + require_api_key=False → no auth header
Remove _read_credentials references (no longer exists), RUNPOD_CREDENTIALS_FILE env var usage, and clear=True from patch.dict calls. All tests now use the autouse isolate_credentials_file fixture and write runpod-python profile format ([default] section). Also fix credentials.py to read CREDENTIAL_FILE dynamically via module reference (enables monkeypatch) and catch TOMLDecodeError from get_credentials() for corrupt file handling.
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes flash login guest-session hangs when RUNPOD_API_KEY (env var or stored credentials) is present by ensuring login/guest flows can explicitly disable auth header injection, while preserving default credential resolution for authenticated flows. Also adds migration support for legacy ~/.config/runpod/credentials.toml into the current runpod-python credential location.
Changes:
- Introduce an
_UNSETsentinel for HTTP session factories soapi_key_override=Nonemeans “no auth”, while omitting the arg still resolves credentials viaget_api_key(). - Rework credential handling to wrap runpod-python credential helpers and add legacy XDG credentials migration.
- Update/expand unit tests and add a shared autouse fixture to isolate credential files during tests.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/runpod_flash/core/utils/http.py |
Adds _UNSET sentinel behavior to prevent accidental credential fallback when api_key_override=None. |
src/runpod_flash/core/credentials.py |
Switches to runpod-python credential APIs and adds legacy XDG migration helper. |
src/runpod_flash/cli/commands/login.py |
Calls legacy credential migration during login flow before polling for approval. |
tests/conftest.py |
Updates isolate_credentials_file fixture to monkeypatch runpod-python credential file path and clear RUNPOD_API_KEY. |
tests/unit/test_login_extended.py |
Expands login and GraphQL session tests to cover “no-auth despite env/file creds” scenarios. |
tests/unit/test_login.py |
Updates login-related tests to use the new credential isolation fixture. |
tests/unit/test_credentials_extended.py |
Updates credential/http helper tests for the new config.toml format and explicit None override behavior. |
tests/unit/test_credentials.py |
Updates core credential tests for config.toml format and new isolation approach. |
tests/unit/test_credential_migration.py |
Adds coverage for migrating legacy XDG credentials to the new location. |
Comments suppressed due to low confidence (1)
src/runpod_flash/core/utils/http.py:128
- The docstrings still describe
api_key_overrideas an optional API key that falls back toRUNPOD_API_KEYwhen not provided, but the new semantics differentiate “parameter omitted” vs “explicitly None” to disable auth. Please update theArgs:docs (and the aiohttp bullet list) to explicitly state: omitted => resolve viaget_api_key(),None=> send no auth header.
Args:
timeout: Request timeout in seconds. Defaults to 30.0.
api_key_override: Optional API key to use instead of RUNPOD_API_KEY env var.
Used for propagating API keys from load-balanced to worker endpoints.
Returns:
Configured httpx.AsyncClient with User-Agent and Authorization headers
Example:
async with get_authenticated_httpx_client() as client:
response = await client.post(url, json=data)
# With custom timeout
async with get_authenticated_httpx_client(timeout=60.0) as client:
response = await client.get(url)
# With API key override (for propagation)
async with get_authenticated_httpx_client(api_key_override=context_key) as client:
response = await client.post(url, json=data)
"""
from .user_agent import get_user_agent
headers = {
"User-Agent": get_user_agent(),
"Content-Type": "application/json",
}
api_key = get_api_key() if api_key_override is _UNSET else api_key_override
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
timeout_config = timeout if timeout is not None else 30.0
return httpx.AsyncClient(timeout=timeout_config, headers=headers)
def get_authenticated_requests_session(
api_key_override: Union[Optional[str], object] = _UNSET,
) -> requests.Session:
"""Create requests Session with RunPod authentication and User-Agent.
Automatically includes:
- User-Agent header identifying flash client and version
- Content-Type: application/json
- Authorization header if an api key is available
Provides a centralized place to manage authentication headers for
synchronous RunPod HTTP requests.
Args:
api_key_override: Optional API key to use instead of RUNPOD_API_KEY env var.
Used for propagating API keys from load-balanced to worker endpoints.
Returns:
Configured requests.Session with User-Agent and Authorization headers
Example:
session = get_authenticated_requests_session()
response = session.post(url, json=data, timeout=30.0)
# Remember to close: session.close()
# Or use as context manager
import contextlib
with contextlib.closing(get_authenticated_requests_session()) as session:
response = session.post(url, json=data)
# With API key override (for propagation)
with contextlib.closing(get_authenticated_requests_session(api_key_override=context_key)) as session:
response = session.post(url, json=data)
"""
from .user_agent import get_user_agent
session = requests.Session()
session.headers["User-Agent"] = get_user_agent()
session.headers["Content-Type"] = "application/json"
api_key = get_api_key() if api_key_override is _UNSET else api_key_override
if api_key:
session.headers["Authorization"] = f"Bearer {api_key}"
return session
def get_authenticated_aiohttp_session(
timeout: float = 300.0,
api_key_override: Union[Optional[str], object] = _UNSET,
use_threaded_resolver: bool = True,
) -> ClientSession:
"""Create aiohttp ClientSession with RunPod authentication and User-Agent.
Automatically includes:
- User-Agent header identifying flash client and version
- Content-Type: application/json
- Authorization header if RUNPOD_API_KEY is set or api_key_override provided
- 5-minute default timeout (configurable)
- TCPConnector with ThreadedResolver for DNS resolution (configurable)
Args:
timeout: Total timeout in seconds (default: 300s for GraphQL operations)
api_key_override: Optional API key to use instead of RUNPOD_API_KEY.
Used for propagating API keys from mothership to worker endpoints.
use_threaded_resolver: Whether to use TCPConnector with ThreadedResolver.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Replace .env-based API key setup with flash login as the primary authentication method across README, skeleton template, init command output, and CLI docs. - Quick Start sections now use flash login instead of cp .env.example - Added Authentication section to skeleton template README - Simplified init command post-creation steps - Added flash-login.md CLI documentation - Updated test_init to verify flash login in steps table
- Add _resolve_api_key() with TypeError guard for invalid override types - Wrap get_credentials() in try/except during migration pre-check - Validate existing api_key with isinstance + .strip() before skipping - Fix docstring to match non-interactive auto-migration behavior - Move migration call to after successful auth (not before polling) - Simplify _OLD_XDG_PATHS tuple to _OLD_XDG_PATH constant - Narrow migration except clause from Exception to (OSError, ValueError)
- Replace Rich Table _cells access with console render-to-string - Update migration test patches from _OLD_XDG_PATHS tuple to _OLD_XDG_PATH
zackmckennarunpod
approved these changes
Mar 12, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
flash loginhangs at "Waiting for authorization..." whenRUNPOD_API_KEYis set (env var or credentials file from a previous login)api_key_override or get_api_key()in all three HTTP session factories conflated "no override provided" with "explicitly no auth" — theorfallback picked up existing credentials and sent a Bearer token, so the server never saw a guest caller_UNSETsentinel default soapi_key_override=None(login flow) means "send no auth" while omitting the parameter preserves existing credential resolutionTest plan
api_key_override=NonewithRUNPOD_API_KEYset — all 3 session factories omit Authorization headerrequire_api_key=Falsewith env var set — GraphQL client session has no auth (exact bug scenario)make quality-checkpasses (85% coverage)RUNPOD_API_KEY, runflash login --force, verify it completesflash login, verify create request succeeds